This tutorial is based partially on the tutorial: http://rstudio.github.io/leaflet/choropleths.html
We will be working with a dataset from the United States Census Bureau of National Population Totals and Components of Change: 2010-2017. This dataset contains the estimated population of each state every year from 2010 through 2017. We will be visualizing the percent change in population for each state in 2016 vs 2010.
# load in the csv
pop.estimates <- read.csv("nst-est2017-popchg2010_2017.csv", stringsAsFactors = FALSE)
# drop the columns we don't care about
pop.change <- pop.estimates[c("STATE", "NAME", "POPESTIMATE2010", "POPESTIMATE2017")]
# calculate columns we might want to visualize
pop.change$difference = pop.change$POPESTIMATE2017 - pop.change$POPESTIMATE2010
pop.change$percentagegrowth = pop.change$POPESTIMATE2017 / pop.change$POPESTIMATE2010
# drop the information for overall United States and Regions (we only want states)
pop.change <- pop.change %>% filter(STATE != 0)
head(pop.change)
## STATE NAME POPESTIMATE2010 POPESTIMATE2017 difference
## 1 1 Alabama 4785579 4874747 89168
## 2 2 Alaska 714015 739795 25780
## 3 4 Arizona 6407002 7016270 609268
## 4 5 Arkansas 2921737 3004279 82542
## 5 6 California 37327690 39536653 2208963
## 6 8 Colorado 5048029 5607154 559125
## percentagegrowth
## 1 1.018633
## 2 1.036106
## 3 1.095094
## 4 1.028251
## 5 1.059178
## 6 1.110761
With our data loaded, we are ready to begin with the visualization. We’ll start by loading the GeoJSON information from a JSON file. This JSON file was found by googling for “US states geoJSON”. We’ll use the geojsonio package to load the data into sp objects. The sp package provides classes and methods for dealing with spatial data in R which will let us easily manipulate the geographic features, and their properties.
# download the .json and save it
u <- "eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_040_00_500k.json"
downloader::download(url = u, destfile="us-states.geojson")
# use geojsonio to load the spatial data into sp objects
states <- geojsonio::geojson_read("us-states.geojson", what= "sp")
#class(states)
names(states)
## [1] "GEO_ID" "STATE" "NAME" "LSAD" "CENSUSAREA"
#str(states)
Let’s take a look at the organization of our two dataframes.
head(pop.change$NAME)
## [1] "Alabama" "Alaska" "Arizona" "Arkansas" "California"
## [6] "Colorado"
head(as.character(states@data$NAME))
## [1] "Maine" "Massachusetts" "Michigan" "Montana"
## [5] "Nevada" "New Jersey"
The dataframe containing Census Data is organized with the states in alphabetical order, whereas the SpatialPolygonsDataFrame is in another order.
WARNING: There is much misleading information online about how to merge a dataset with your SpatialPolygonsDataFrame. Be careful! R will happily merge these dataframes in a completely crazy order and not throw an error leading to plotting of the wrong data for each state!
# Reorder the data to match the order of the SpatialPolygonsDataFrame
#pop.change <- pop.change[order(match(pop.change$NAME, states@data$NAME)),]
# Add a new column to the SpatialPolygonsDataFrame@data with our data of interest
#states@data$percentagegrowthOLD <- pop.change$percentagegrowth
# Add a new column to the SpatialPolygonsDataFrame@data with our data of interest
states@data <- merge(states@data, pop.change %>% select(NAME, percentagegrowth), by = "NAME")
Let’s start out by visualing the polygons described in our SpatialPolygonsDataFrame.
# provide leaflet with the SpatialPolygonsDataFrame
m <- leaflet(states) %>%
setView(-96, 37.8, 4) %>% # set the view to the contiguous United States
# set what the background map should look like.
#addTiles() # basic
addProviderTiles("Stamen.Watercolor") #FUN
# what do we have so far
m
Almost beautiful enough to stop there. But let’s add the polygons described in our SpatialPolygonsDataFrame.
m %>% addPolygons()
It seems like we just ruined a perfectly good watercolor. This needs some data to redeem the map.
We now want to color by a feautre of our data, the percentage of growth from 2010 to 2017 in each state. First, we need to create our color scale for this data. Let’s split bin on populations that have decreased and increased
We will now create bins based on this range and use those bins to divide a colorscale up.
bins <- c(0, 1.0, Inf)
pal <- colorBin("YlOrRd", domain = states$percentagegrowth, bins = bins)
Now, using the feature data we will color the polygons.
withcolor <- m %>%
addPolygons(
fillColor = ~pal(states$percentagegrowth),
weight = 2,
opacity = 0.5,
color = "white",
dashArray = "3",
fillOpacity = 0.7)
withcolor
It’s a choropleth. But wait! What do all those colors mean?
withcolor %>% addLegend(pal = pal, values = ~states$percentagegrowth, opacity = 0.7, title="Population Growth Since 2010", position = "bottomright")
Better as far as responsible reporting goes. We can quickly see which states had a population decrease in 2017 from 2010. However, this seems to be a waste of the visual space. We could have simply listed states that saw a decrease in population and not used up so much of the page. Let’s make this map more informative. It would be interesting to see differences in the percent increase, 15% population increase in 7 years is quite different than 0.08% increase.
Lab Exercise 1: Play with the binning to make the map more informative.
bins <- c(0, seq(1, 1.15, 0.03), Inf)
pal <- colorBin("YlOrRd", domain = states$percentagegrowth, bins = bins)
withcolor <- m %>%
addPolygons(
fillColor = ~pal(states$percentagegrowth),
weight = 2,
opacity = 0.5,
color = "white",
dashArray = "3",
fillOpacity = 0.7) %>%
addLegend(pal = pal, values = ~states$percentagegrowth, opacity = 0.7, title="Population Growth Since 2010", position = "bottomright")
withcolor
Lab Exercise 2: Aesthetics: Improve the legend, change the color scheme.
Advanced: Find a different provider tile for the background and change the aesthetics to match
Now what this map needs is some interactivity. It’s 2018, you can’t have a visualization without it.
First, we’re going to create a response to hovering over the polygons.
labels <- sprintf(
"<strong>%s</strong><br/>%g",
states$NAME, states$percentagegrowth
) %>% lapply(htmltools::HTML)
hovering <-m %>% addPolygons(
fillColor = ~pal(states$percentagegrowth),
weight = 2,
opacity = 1,
color = "white",
dashArray = "3",
fillOpacity = 0.7,
highlight = highlightOptions(
weight = 5,
color = "#666",
dashArray = "",
fillOpacity = 0.7,
bringToFront = TRUE
)
)
hovering %>% addLegend(pal = pal, values = ~states$percentagegrowth, opacity = 0.7, title="Population Growth Since 2010", position = "bottomright")
Lab Exercise 3: Wow that hover border is gross looking. Please fix it
Finally, we are going to create a popup to provide information while hovering.
labels <- sprintf(
"<strong>%s</strong><br/>%g Percent",
states$NAME, states$percentagegrowth
) %>% lapply(htmltools::HTML)
final <- m %>% addPolygons(
fillColor = ~pal(states$percentagegrowth),
weight = 2,
opacity = 1,
color = "white",
dashArray = "3",
fillOpacity = 0.7,
highlight = highlightOptions(
weight = 5,
color = "#666",
dashArray = "",
fillOpacity = 0.7,
bringToFront = TRUE
),
label = labels,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
testsize = "15px",
direction = "auto"
)
)
final %>% addLegend(pal = pal, values = ~states$percentagegrowth, opacity = 0.7, title="Population Growth Since 2010", position = "bottomright")
Note* Formating the labelOptions doesn’t work for me. And we’ve done it! An interactive choropleth! Now, does it make sense to use the percent increase? Will we see anything different if we use raw numbers? We originally created a column for the difference in population from 2017 to 2010.
Lab Exercise 4: Swap the data to the raw difference in population Lab Exercise 5: If you haven’t already, change the aesthetics of the map Advanced: Find a dataset at the county level (optional: of Florida). Find a geoJSON with county level information. Use Leaflet to create an interactive map. Challenge: Feeling like your map is looking pretty good? Enter your map to be evaluated by your peers at the end of class for a chance to win a prize.